-
Notifications
You must be signed in to change notification settings - Fork 48
Conversation
assets/terraform-modules/bare-metal/flatcar-linux/kubernetes/cl/controller.yaml.tmpl
Show resolved
Hide resolved
180f662
to
02acf75
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#449 would be nice 😢
Overall LGTM, left some questions.
assets/terraform-modules/aws/flatcar-linux/kubernetes/cl/controller.yaml.tmpl
Show resolved
Hide resolved
assets/terraform-modules/aws/flatcar-linux/kubernetes/cl/controller.yaml.tmpl
Show resolved
Hide resolved
assets/terraform-modules/aws/flatcar-linux/kubernetes/cl/controller.yaml.tmpl
Outdated
Show resolved
Hide resolved
assets/terraform-modules/packet/flatcar-linux/kubernetes/workers/cl/worker.yaml.tmpl
Show resolved
Hide resolved
02acf75
to
a360007
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK
ExecStart=sh -c "docker run --network=host \ | ||
-u $(id -u \"$${ETCD_USER}\"):$(id -u \"$${ETCD_USER}\") \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't we remove the sh -c
?
I guess it is used for the id -u
part. But docker takes a string for -u
param. Do we really need to run that? Even if we do, can't we add a variable for that in the env file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think it should be possible to remove it. Good point @rata
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tl;dr; string user names don't work on flatcar.
On Flatcar when I do this I get the user id:
# id -u etcd
232
But there is no entry for that user in /etc/passwd
:
# cat /etc/passwd
root:x:0:0:root:/root:/bin/bash
core:x:500:500:Flatcar Admin:/home/core:/bin/bash
systemd-timesync:x:997:997:systemd Time Synchronization:/:/sbin/nologin
systemd-coredump:x:996:996:systemd Core Dumper:/:/sbin/nologin
So docker fails:
# docker run -u etcd fedora bash
/run/torcx/bin/docker: Error response from daemon: linux spec user: unable to find user etcd: no matching entries in passwd file.
ERRO[0000] error waiting for container: context canceled
There is a workaround you can use which is echo 'etcd:x:232:232::/dev/null:/sbin/nologin' | sudo tee -a /etc/passwd
but this does not work as well.
So there is another passwd file at play here: https://github.com/flatcar-linux/baselayout/blob/flatcar-master/baselayout/passwd, this is where we get the user id of etcd from.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a workaround you can use which is echo 'etcd:x:232:232::/dev/null:/sbin/nologin' | sudo tee -a /etc/passwd but this does not work as well.
Can't we create the user using ignition too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added following line, but there was no effect:
diff --git a/assets/terraform-modules/aws/flatcar-linux/kubernetes/cl/controller.yaml.tmpl b/assets/terraform-modules/aws/flatcar-linux/kubernetes/cl/controll
er.yaml.tmpl
index fd50be82a..e911708ee 100644
--- a/assets/terraform-modules/aws/flatcar-linux/kubernetes/cl/controller.yaml.tmpl
+++ b/assets/terraform-modules/aws/flatcar-linux/kubernetes/cl/controller.yaml.tmpl
@@ -252,3 +252,4 @@ passwd:
users:
- name: core
ssh_authorized_keys: ${ssh_keys}
+ - name: etcd
on the controller host:
# cat /etc/passwd
root:x:0:0:root:/root:/bin/bash
core:x:500:500:Flatcar Admin:/home/core:/bin/bash
systemd-timesync:x:997:997:systemd Time Synchronization:/:/sbin/nologin
systemd-coredump:x:996:996:systemd Core Dumper:/:/sbin/nologin
Also if I manually try to add the user it does not work, because the user already exists:
# useradd etcd
useradd: user 'etcd' already exists
The user is already there in:
# cat /usr/share/baselayout/passwd | grep etcd
etcd:x:232:232::/dev/null:/sbin/nologin
But docker does not identify this, hence the workaround.
I had a chat with Thilo and this is what he had to add:
Baselayout provides
/etc/passwd
for theinitrd
, and includes theetcd
user. The/etc/passwd
is the runtime configuration, which is generated from the systemd build recipe.
- baselayout (w/
etcd
user: https://github.com/flatcar-linux/baselayout/blob/flatcar-master/baselayout/passwd).- systemd build recipe in
coreos-overlay
: https://github.com/flatcar-linux/coreos-overlay/blob/main/sys-apps/systemd/systemd-9999.ebuild#L534-L542For a quick patch, please try and work around this by issuing
echo 'etcd:x:232:232::/dev/null:/sbin/nologin' | sudo tee -a /etc/passwdIn the long term we should really only have one authoritative source for passwd.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we want to investigate it later, I think it is cleaner to create a variable in the env-file with the UID and use that variable here, instead of running a shell to run id -u.
That's a lot of indirection then to have separate unit to write env file etc. IMO the current way is simpler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, right, if we need a new unit for this (I thought we had one to create those things already) then this way is not so bad :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so can we resolve this one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think @rata suggested to either investigate deeper why we can't properly use etcd user on Flatcar OR create an issue for it to investigate later (though honestly, this seems like an issue we will never go back to, but I don't mind having it created).
And as creating new unit to create env file seems complicated, I have a feeling that currently approach with sh
is okay'ish.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Docker takes the user ID from the image itself. If the image specifies an etcd user, this has probably not the same user ID as the one Flatcar uses, therefore, the current way makes sense as long as you want to depend on the basic Flatcar etcd setup.
assets/terraform-modules/aws/flatcar-linux/kubernetes/cl/controller.yaml.tmpl
Outdated
Show resolved
Hide resolved
-v /etc/kubernetes:/etc/kubernetes:ro \ | ||
-v /etc/machine-id:/etc/machine-id \ | ||
-v /lib/modules:/lib/modules \ | ||
-v /run:/run:rw \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not rw just to a specific path only for the kubelet files?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It uses unspecified files from here and a lock file:
# lsof | grep '/run' | grep kubelet
kubelet 10931 root 5uW REG 0,21 0 30352 /run/lock/kubelet.lock
kubelet 10931 root 10u unix 0xffff9fd3b65b6400 0t0 311644 /var/run/661904757 type=STREAM
kubelet 10931 root 14u unix 0xffff9fd3b65b4c00 0t0 311194 /var/run/661904757 type=STREAM
kubelet 10931 root 20u unix 0xffff9fd3b65b5400 0t0 311195 /var/run/661904757 type=STREAM
kubelet 10931 root 31u unix 0xffff9fd2b4d93800 0t0 311748 /var/run/661904757 type=STREAM
kubelet 10931 10948 root 5uW REG 0,21 0 30352 /run/lock/kubelet.lock
kubelet 10931 10948 root 10u unix 0xffff9fd3b65b6400 0t0 311644 /var/run/661904757 type=STREAM
kubelet 10931 10948 root 14u unix 0xffff9fd3b65b4c00 0t0 311194 /var/run/661904757 type=STREAM
kubelet 10931 10948 root 20u unix 0xffff9fd3b65b5400 0t0 311195 /var/run/661904757 type=STREAM
kubelet 10931 10948 root 31u unix 0xffff9fd2b4d93800 0t0 311748 /var/run/661904757 type=STREAM
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, and the path can't be configured to use another one? If we can lock it down more, IMHO seems better. But we were running like this before with rkt, so no biggie :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, /run/lock
must be shared via host path between host kubelet and self-hosted one. But /var/run
is separate and only in the container, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Between /var
and /var/run
, one is symlink of other
assets/terraform-modules/aws/flatcar-linux/kubernetes/cl/controller.yaml.tmpl
Show resolved
Hide resolved
assets/terraform-modules/aws/flatcar-linux/kubernetes/cl/controller.yaml.tmpl
Show resolved
Hide resolved
assets/terraform-modules/aws/flatcar-linux/kubernetes/cl/controller.yaml.tmpl
Show resolved
Hide resolved
a10f0d7
to
f760a3c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some nits, which might not be relevant for now. Overall the PR looks good to me 👍
I'd also like to have opened conversations resolved before we merge it.
assets/terraform-modules/aws/flatcar-linux/kubernetes/cl/controller.yaml.tmpl
Outdated
Show resolved
Hide resolved
assets/terraform-modules/aws/flatcar-linux/kubernetes/cl/controller.yaml.tmpl
Show resolved
Hide resolved
assets/terraform-modules/aws/flatcar-linux/kubernetes/cl/controller.yaml.tmpl
Show resolved
Hide resolved
fc76aa7
to
a02cf73
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some nits regarding newly added comments.
a02cf73
to
283e863
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a reference, that this PR also addresses #917.
283e863
to
a9224ed
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This overall LGTM (some small nit-picking comments here&there), but there is one unknown that worries me a little bit: https://github.com/kinvolk/lokomotive/pull/946/files#r494871543
TimeoutStartSec=0 | ||
LimitNOFILE=40000 | ||
EnvironmentFile=/etc/kubernetes/etcd.env | ||
ExecStart=sh -c "docker run --network=host \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this might refer to the same problem the link from systemd-docker alban posted mention, but don't we want to exec docker ...
instead of just running it? I'm not sure if systemd will handle correctly a child for this process (without exec, the shell forks and executes).
I ignore about systemd to know if exec
or not will be better. But seems something that can be relevant. Did you chose this for some reason?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In current implementation now, no matter who kills kubelet systemd will always restart the process again.
a9224ed
to
9427c7b
Compare
assets/terraform-modules/aws/flatcar-linux/kubernetes/cl/controller.yaml.tmpl
Show resolved
Hide resolved
9427c7b
to
ee89965
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just one Q, otherwise LGTM.
assets/terraform-modules/aws/flatcar-linux/kubernetes/cl/controller.yaml.tmpl
Outdated
Show resolved
Hide resolved
7f0f2ff
to
f6cf4ca
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
This commit removes rkt and uses docker to start services. etcd(for controllers only): - Move the env var to a separate env var file. - Add etcd service file. - Change the name of service from etcd-member to etcd. Bootkube(for controllers only): Use `docker run` instead of rkt. Kubelet(for controllers and workers): Remove some of the folder creation in `ExecStartPre`, these are automatically created by docker, when mounted using `-v` flag. delete-node(for workers): Use `docker run` instead of rkt. Signed-off-by: Suraj Deshmukh <suraj@kinvolk.io>
This commit removes rkt and uses docker to start services. etcd(for controllers only): - Move the env var to a separate env var file. - Add etcd service file. - Change the name of service from etcd-member to etcd. Bootkube(for controllers only): Use `docker run` instead of rkt. Kubelet(for controllers and workers): Remove some of the folder creation in `ExecStartPre`, these are automatically created by docker, when mounted using `-v` flag. delete-node(for workers): Use `docker run` instead of rkt. Signed-off-by: Suraj Deshmukh <suraj@kinvolk.io>
Fix the env var file with change in removal of rkt. Signed-off-by: Suraj Deshmukh <suraj@kinvolk.io>
This commit removes rkt and uses docker to start services. etcd(for controllers only): - Move the env var to a separate env var file. - Add etcd service file. - Change the name of service from etcd-member to etcd. Bootkube(for controllers only): Use `docker run` instead of rkt. Kubelet(for controllers and workers): Remove some of the folder creation in `ExecStartPre`, these are automatically created by docker, when mounted using `-v` flag. Signed-off-by: Suraj Deshmukh <suraj@kinvolk.io>
Signed-off-by: Suraj Deshmukh <suraj@kinvolk.io>
f6cf4ca
to
b45468f
Compare
Thanks everyone for your reviews 🎉 |
This commit removes rkt and uses docker to start services.
etcd(for controllers only):
Bootkube(for controllers only):
docker run
instead of rkt.Kubelet(for controllers and workers):
ExecStartPre
, these areautomatically created by docker, when mounted using
-v
flag.delete-node(for workers):
docker run
instead of rkt.Fixes #720
Fixes #917
Release Notes
etcd-member.service
toetcd.service
. Old mechanism of running etcd(using etcd-wrapper) entailed we use the Flatcar shippedetcd-member.service
./etc/kubernetes/etcd.env
on controller hosts.rkt
.